- 40Hex Number 6 Volume 2 Issue 2 File 00B
-
- ------------------------------
- SCAN STRINGS, HOW THEY WORK,
- AND HOW TO AVOID THEM
- ------------------------------
- By Dark Angel
- ------------------------------
-
- Scan strings are the scourge of the virus author and the friend of anti-
- virus wanna-bes. The virus author must find encryption techniques which
- can successfully evade easy detection. This article will show you several
- such techniques.
-
- Scan strings, as you are well aware, are a collection of bytes which an
- anti-viral product uses to identify a virus. The important thing to keep
- in mind is that these scan strings represent actual code and can NEVER
- contain code which could occur in a "normal" program. The trick is to use
- this to your advantage.
-
- When a scanner checks a file for a virus, it searches for the scan string
- which could be located ANYWHERE IN THE FILE. The scanner doesn't care
- where it is. Thus, a file which consists solely of the scan string and
- nothing else would be detected as infected by a virus. A scanner is
- basically an overblown "hex searcher" looking for 1000 signatures.
- Interesting, but there's not much you can do to exploit this. The only
- thing you can do is to write code so generic that it could be located in
- any program (by chance). Try creating a file with the following debug
- script and scanning it. This demonstrates the fact that the scan string
- may be located at any position in the file.
-
- ---------------------------------------------------------------------------
-
- n marauder.com
- e 0100 E8 00 00 5E 81 EE 0E 01 E8 05 00 E9
-
- rcx
- 000C
- w
- q
-
- ---------------------------------------------------------------------------
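-
- The "hex searcher" behaviour described above can be sketched in a few
- lines of Python. This is only an illustration of the idea, not how SCAN
- or any real product is implemented. The name find_signature is made up
- for this example, and the signature is simply the twelve bytes from the
- debug script.

```python
# Illustrative sketch only: a scanner reduced to a plain substring search.
# SIGNATURE is the twelve-byte string from the debug script above.
SIGNATURE = bytes.fromhex("E8 00 00 5E 81 EE 0E 01 E8 05 00 E9")


def find_signature(data: bytes, signature: bytes = SIGNATURE) -> int:
    """Return the offset of the first match in data, or -1 if absent."""
    return data.find(signature)


# The position of the string does not matter: a file holding nothing but
# the signature matches, and so does one with padding on either side.
bare = SIGNATURE
padded = b"\x90" * 100 + SIGNATURE + b"\x00" * 20
clean = b"just an ordinary file"

offsets = [find_signature(blob) for blob in (bare, padded, clean)]
```

- Here offsets holds the match position for each blob, with -1 meaning
- that the string was not found.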
-
- Although scanners normally search for decryption/encryption routines, in
- Marauder's case, SCAN looks for the "setup" portion of the code, i.e.
- setting up BP (to the "delta offset"), calling the decryption routine, and
- finally jumping to program code.
-
- What you CAN do is to either minimise the scannable code or to have the
- code constantly mutate into something different. The reasons are readily
- apparent.
-
- The simplest technique is having multiple encryption engines. A virus
- utilising this technique has a database of encryption/decryption engines
- and uses a random one each time it infects. For example, there could be
- various forms of XOR encryption or perhaps another form of mathematical
- encryption. The trick is to simply replace the code for the encryption
- routine each time with the new encryption routine.
-
- Mark Washburn used this in his V2PX series of virii. In it, he used six
- different encryption/decryption algorithms, and some mutations are
- impossible to detect with a mere scan string. More on those later.
-
- Recently, there has been talk of the so-called MTE, or mutating engine,
- from Bulgaria (where else?). It utilises the multiple encryption engine
- technique. Pogue Mahone used the MTE and it took McAfee several days to
- find a scan string. Vesselin Bontchev, the McAfee-wanna-be of Bulgaria,
- marvelled at the engineering of this engine. It is distributed as an OBJ
- file designed to be linked into any virus. Supposedly, SCANV89 will
- be able to detect any virus using the encryption engine, so it is worthless
- except for those who have an academic interest in such matters (such as
- virus authors).
-
- However, there is a serious limitation to the multiple encryption
- technique, namely that scan strings may still be found. Even so, a
- separate scan string must be isolated for each encryption mechanism. An
- additional benefit is the possibility that the antivirus software
- developers will miss some of the encryption mechanisms so not all the
- strains of the virus will be caught by the scanner.
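-
- From the scanner's side, the point about one scan string per mechanism
- might look like the sketch below. It is an assumption-laden illustration:
- the names scan and SIGNATURES and the byte values are all hypothetical and
- do not correspond to any real virus or product.

```python
# Illustrative sketch only: a scanner needs one entry per decryption
# routine. All names and byte values here are hypothetical placeholders.
SIGNATURES = {
    "Example.A": bytes.fromhex("DE AD BE EF 01"),
    "Example.B": bytes.fromhex("CA FE BA BE 02"),
}


def scan(data: bytes, signatures: dict) -> list:
    """Return the names of every signature found anywhere in data."""
    return [name for name, sig in signatures.items() if sig in data]


sample_b = b"\x00" * 8 + SIGNATURES["Example.B"] + b"\x00" * 8

# With the full database the sample is reported.
full_hits = scan(sample_b, SIGNATURES)

# If the vendor isolated only the first mechanism, the same sample passes.
partial = {"Example.A": SIGNATURES["Example.A"]}
partial_hits = scan(sample_b, partial)
```

- A database that lacks an entry for one mechanism reports nothing for
- samples that use it, which is the gap described above.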
-
- Now we get to a much better (and sort of obvious) method: minimising scan
- code length. There are several viable techniques which may be used, but I
- shall discuss only three of them.
-
- The one mentioned before which Mark Washburn used in V2P6 was interesting.
- He first filled the space to be filled in with the encryption mechanism
- with dummy one byte op-codes such as CLC, STC, etc. As you can see, the
- flag manipulation op-codes were exploited. Next, he randomly placed the
- parts of his encryption mechanism in parts of this buffer, i.e. the gaps
- between the "real" instructions were filled in with random dummy op-codes.
- In this manner, no generic scan string could be located for this encryption
- mechanism of this virus. However, the disadvantage of this method is the
- sheer size of the code necessary to perform the encryption.
-
- A second method is much simpler than this and possibly just as effective.
- To minimise scan code length, all you have to do is change certain bytes at
- various intervals. The best way to do this can be explained with the
- following code fragment:
-
- mov si, 1234h ; Starting location of encryption
- mov cx, 1234h ; Virus size / 2 + variable number
- loop_thing:
- xor word ptr cs:[si], 1234h ; Decrypt the value
- add si, 2
- loop loop_thing
-
- In this code fragment, all the values which can be changed are set to 1234h
- for the sake of clarity. Upon infection, all you have to do is to set
- these variable values to whatever is appropriate for the file. For
- example, mov si, 1234h would have to be changed to have the encryption
- start wherever the virus would be loaded into memory (huh?). Ponder
- this for a few moments and all shall become clear. To substitute new
- values into the code, all you have to do is something akin to:
-
- mov [bp+scratch+1], cx
-
- Where scratch is an instruction. The exact value to add to scratch depends
- on the coding of the op-code. Some op-codes take their argument as the
- second byte, others take the third. Regardless, it will take some
- tinkering before it is perfect. In the above case, the "permanent" code is
- limited to under five or six bytes. Additionally, these five or six bytes
- could theoretically occur in ANY PROGRAM WHATSOEVER, so it would not be
- prudent for scanners to search for these strings. However, scanners often
- use scan strings with wild-card-ish scan string characters, so it is still
- possible for a scan string to be found.
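-
- A wild-card scan string of the kind just mentioned can be sketched as
- follows. This is a minimal illustration and not any product's actual
- format: the "??" notation and the names parse_pattern and find_wildcard
- are assumptions, and the pattern simply wild-cards two operand bytes of
- the sample string from the debug script earlier in this article.

```python
# Illustrative sketch only: matching a scan string that contains wild cards.
# "??" matches any single byte; the notation is an assumption for this demo.
def parse_pattern(text: str) -> list:
    """Turn 'E8 ?? 5E' into [0xE8, None, 0x5E]; None means 'any byte'."""
    return [None if tok == "??" else int(tok, 16) for tok in text.split()]


def find_wildcard(data: bytes, pattern: list) -> int:
    """Return the offset of the first match in data, or -1 if absent."""
    n = len(pattern)
    for start in range(len(data) - n + 1):
        if all(p is None or data[start + i] == p
               for i, p in enumerate(pattern)):
            return start
    return -1


# The two bytes after 81 EE are treated as variable.
PATTERN = parse_pattern("E8 00 00 5E 81 EE ?? ?? E8")

original = bytes.fromhex("90 E8 00 00 5E 81 EE 0E 01 E8")
patched = bytes.fromhex("E8 00 00 5E 81 EE 34 12 E8")   # operand changed
other = bytes.fromhex("E8 00 00 5E 81 EF 34 12 E8")     # fixed byte differs

results = [find_wildcard(b, PATTERN) for b in (original, patched, other)]
```

- Changing only the wild-carded bytes does not prevent a match, while
- changing any fixed byte does, which is why the fixed bytes are the ones
- that matter.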
-
- The important thing to keep in mind when using this method is that it is
- best for the virus to use separate encryption and decryption engines. In
- this manner, shorter decryption routines may be found and thus shorter scan
- strings will be needed. In any case, using separate encryption and
- decryption engines increases the size of the code by at most 50 bytes.
-
- The last method detailed is theft of decryption engines. Several shareware
- products utilise decryption engines in their programs to prevent simple
- "cracks" of their products. This is, of course, not a deterrent to any
- programmer worth his salt, but it is useful for virus authors. If you
- combine the method above with this technique, the scan string would
- identify the product as being infected with the virus, which is a) bad PR
- for the company and b) unsuitable for use as a scan string. This technique
- requires virtually no effort, as the decryption engine is already written
- for you by some unsuspecting PD programmer.
-
- All the methods described are viable scan string avoidance techniques
- suitable for use in any virus. After a few practice tries, scan string
- avoidance should become second nature and will help tremendously in
- prolonging the effective life of your virus in the wild.
-